Update Wappalyzer #3800
…update-wappalyzer
Can confirm that this still generates Software and SoftwareInstance OOIs, but I cannot confirm whether there are more or fewer of them now.
Observed the following two things:
webpageanalysis port 443
RetireJS port 443
The pending findings should trigger boefje jobs for the various hydration boefjes. I'm not sure why they have not run, but it is unlikely to be related to this PR. It looks like Bytes was unable to store the raw files for both boefje runs, and I'm not sure why. If rescheduling worked, Bytes might have had trouble reaching one of its underlying services (RabbitMQ or PostgreSQL?), which would also be unlikely to be related to this PR.
# Conflicts: # boefjes/boefjes/plugins/kat_webpage_analysis/main.py
…update-wappalyzer
# Conflicts: # boefjes/poetry.lock # boefjes/pyproject.toml # boefjes/requirements-dev.txt # boefjes/requirements.txt
This PR removes the only Python package that is installed from git, so the pip workaround that made this work is no longer necessary. Because grep returns a non-zero exit status when there is no match, the Debian packaging now fails with an error. In the Dockerfile this isn't a problem, because we pipe directly to pip and the grep exit status is ignored, but we should remove the workaround there as well since it is no longer needed. We should also check whether the new technology files are shipped in the Debian package.
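For anyone who wants to reproduce the exit-status difference in isolation, here is a minimal sketch. It assumes a POSIX `sh` and `grep` on the PATH. The requirements text and the `git+https` pattern are made up for illustration and are not the actual packaging or Dockerfile commands.

```python
import subprocess

# Hypothetical requirements content that no longer lists a git-installed package.
REQUIREMENTS = "requests==2.31.0\nhttpx==0.27.0\n"

# grep on its own: no line matches, so grep exits with status 1.
alone = subprocess.run(
    ["grep", "git+https"],
    input=REQUIREMENTS,
    text=True,
    capture_output=True,
)

# grep piped into another command: without `pipefail`, the shell reports the
# status of the last command in the pipeline (here `cat`), so grep's
# non-zero status is hidden.
piped = subprocess.run(
    ["sh", "-c", "grep 'git+https' | cat"],
    input=REQUIREMENTS,
    text=True,
    capture_output=True,
)

print("grep alone:", alone.returncode)
print("grep in a pipeline:", piped.returncode)
```

A build step that runs grep as a standalone command and treats any non-zero status as fatal will therefore fail once the last matching line is gone, while the piped variant keeps passing silently.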
…update-wappalyzer
…update-wappalyzer # Conflicts: # boefjes/Dockerfile
Checklist for QA:
What works:
The normalizer runs and checks for software on the main page. @ammar92 added inline script tags and updated the technology files. It also works with the current version of the technology files, and it now also produces HAR files. The bugs below are known, but as this PR is a big improvement over what is currently on main, I suggest we merge it, as maintaining this is quite annoying. :)
What doesn't work:
See below.
Bug or feature?:
Neither the current version on main nor this PR checks the HTTP resources on pages (e.g. /js/core.js is not analysed and picked up), even though this is often where software (versions) can be found. The very old version of Wappalyzer did analyse these files, but did not create proof. This will be picked up in a new PR, as it also requires some discussion on how pages are crawled/analysed and how external resources are handled. #4020
Quality Gate failed
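To make the distinction between inline scripts and external resources concrete, here is a rough sketch using only the standard library. It is not the normalizer's code. The `ScriptCollector` class and the example page are hypothetical and only illustrate which parts of a page are available without extra HTTP requests.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptCollector(HTMLParser):
    """Splits <script> tags into inline bodies and external URLs."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.external: list[str] = []  # would need a separate HTTP fetch
        self.inline: list[str] = []    # already present in the page body
        self._in_inline = False

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths such as /js/core.js against the page URL.
            self.external.append(urljoin(self.base_url, src))
        else:
            self._in_inline = True
            self.inline.append("")

    def handle_data(self, data):
        if self._in_inline:
            self.inline[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline = False


PAGE = (
    "<html><head>"
    '<script src="/js/core.js"></script>'
    "<script>var x = 1;</script>"
    "</head><body></body></html>"
)

collector = ScriptCollector("https://example.com/index.html")
collector.feed(PAGE)
print("inline:", collector.inline)
print("external:", collector.external)
```

Only the entries in `inline` can be inspected from the main page alone. Everything in `external` falls under the crawling discussion in #4020.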
Changes
This update replaces the outdated and archived Wappalyzer dependency with a more current fork. It also adds a script that downloads an updated technologies file from a project that aims to maintain Wappalyzer technologies files. For now the technology file has to be updated manually from time to time, but we should work towards a plan for an automated way of doing this.
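As a rough idea of what such an update step can look like, here is a minimal sketch. It is not the script in this PR. `TECHNOLOGIES_URL`, `fetch_bytes` and `update_technologies` are hypothetical names, the URL is a placeholder, and the sketch assumes the technologies file is a single JSON object keyed by technology name.

```python
import json
import urllib.request
from pathlib import Path
from typing import Callable

# Placeholder only; the real upstream location is not stated here.
TECHNOLOGIES_URL = "https://example.invalid/technologies.json"


def fetch_bytes(url: str) -> bytes:
    """Download the raw file contents."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def update_technologies(
    destination: Path,
    url: str = TECHNOLOGIES_URL,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> int:
    """Fetch, validate and store the technologies file; return the entry count."""
    technologies = json.loads(fetch(url))  # raises ValueError on invalid JSON
    if not isinstance(technologies, dict) or not technologies:
        raise ValueError("expected a non-empty JSON object of technologies")

    # Write to a temporary file first so a failed run never leaves a
    # half-written technologies file behind.
    tmp = destination.with_suffix(destination.suffix + ".tmp")
    tmp.write_text(json.dumps(technologies, indent=2, sort_keys=True), encoding="utf-8")
    tmp.replace(destination)
    return len(technologies)
```

Passing the fetcher in as a parameter keeps the validation and write logic testable without network access, which would also help if this is later run automatically.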
Issue link
Closes #3037
QA notes
Run the Wappalyzer normalizer and see if it still produces Software and SoftwareInstance OOIs.
Code Checklist
.env changes files if required and changed the .env-dist accordingly.
Checklist for code reviewers:
Copy-paste the checklist from the docs/source/templates folder into your comment.
Checklist for QA:
Copy-paste the checklist from the docs/source/templates folder into your comment.